For numerical data (meaning interval and ratio data), the main issue is how much precision to

record. Recording a numeric value to as many decimals as you have available is usually best. For

example, if a scale can measure body weight to the nearest tenth of a kilogram, record it in the

database to that degree of precision. You can always round off to the nearest kilogram later if you

want, but you can never “unround” a number to recover digits you didn’t record. So it’s best to

record values in your data from measurement instruments to the degree of precision provided.

Along the same lines, don’t group numerical data into intervals when recording it. If you know the age

to the nearest year, don’t record Age in 10-year intervals (such as 20 to 29, 30 to 39, 40 to 49, and so

on). You can always have the computer do that kind of grouping later, but you can never recover the

age in years if all you record is the decade.

Some statistical programs let you store numbers in different formats. The program may refer to these

different storage modes using arcane terms for short, long, or very long integers (whole numbers) or

single-precision (short) or double-precision (long) floating point (fractional) numbers. Each type has

its own limits, which may vary from one program to another or from one kind of computer to another.

For example, a short integer may be able to represent only whole numbers within the range from

, whereas a double-precision floating-point number could easily handle a number

like

. Excel has no trouble storing numerical data in any of these formats, so

to make these choices, it is best to study the statistical program you will use to analyze the data. That

way, you can make rules for storing the data in Excel that make it easy for you to analyze the data once

it is imported into the statistical program.

Following are issues to consider with respect to numerical variables in Excel:

Don’t put two numbers (such as a blood pressure reading of

mmHg) into one column of

data. Excel won’t complain about it, but it will treat it as text because of the embedded “/”, rather

than as numerical data. Instead, create two separate variables and enter each number into the

appropriate variable.

When recording multiple types of measurements (such as days, weeks, months, and years), use two

columns to record the data (such as time and type). In the first column, store the value of the

variable, and in the second column, store a code to indicate the type (such as 1 = days, 2 = weeks,

3 = months, and 4 = years). As an example, “3 weeks” would be entered as a 3 in the time

column and 2 in the type column.

Missing numerical data requires a little more thought than missing categorical data. Some

researchers use 99 (or 999, or 9999) to indicate a missing value in categorical data, but this

approach should not be used for numeric data (because the statistical program will see these

values as actual measured values, and not codes for missing data). The simplest technique for